DETAILED ACTION



Claim Rejections - 35 USC § 102 



The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 
A person shall be entitled to a patent unless - 

(e) the invention was described in (1) an application for patent, published under section 122(b), by 
another filed in the United States before the invention by the applicant for patent or (2) a patent 
granted on an application for patent by another filed in the United States before the invention by the 
applicant for patent, except that an international application filed under the treaty defined in section 
351(a) shall have the effects for purposes of this subsection of an application filed in the United States
only if the international application designated the United States and was published under Article 21(2) 
of such treaty in the English language. 



Claims 1, 7-18, 20, and 22-28 are rejected under 35 U.S.C. 102(e) as being
anticipated by Mizuno et al. (US Patent No. 6226614). 

1. Regarding claim 1, Mizuno et al. disclose a method for modifying synthesized
speech, the method including the steps of: 

generating synthesized speech based on textual input and a plurality of run-time 
control parameter values (col. 8, ln. 16-52 or referring to figures 2 and 3);

generating real-time data based on an input signal, the input signal 
characterizing an intelligibility of the speech with regard to a listener (col. 8, ln. 26-39);
and 

modifying one or more of the run-time control parameter values based on the 
real-time data such that the intelligibility of the speech increases (col. 8, ln. 53 to col. 9,
ln. 54 or referring to figure 3).

2. Regarding claim 22, Mizuno et al. disclose a method for modifying one or more
speech synthesizer run-time control parameters, the method comprising the steps of: 

receiving the real-time data (col. 8, ln. 53 to col. 9, ln. 19);

identifying relevant characteristics of the speech based on the real-time data, the
relevant characteristics having corresponding run-time control parameters
(col. 9, ln. 20-34); and

applying adjustment values to parameter values of the control parameters such
that the relevant characteristics of the speech change in a desired fashion
(col. 9, ln. 20-34).

3. Regarding claim 7, Mizuno et al. further disclose the steps of: 
receiving the real-time data (col. 8, ln. 53 to col. 9, ln. 19);

identifying relevant characteristics of the speech based on the real-time data, the
relevant characteristics having corresponding run-time control parameters
(col. 9, ln. 20-34); and

applying adjustment values to parameter values of the control parameters such
that the relevant characteristics of the speech change in a desired fashion
(col. 9, ln. 20-34).



4. Regarding claims 8 and 23, Mizuno et al. further disclose the step of changing 
relevant speaker characteristics of the speech (col. 7, ln. 38-67 and col. 9, ln. 20-54).

5. Regarding claims 9 and 24, Mizuno et al. further disclose the step of changing
relevant voice characteristics of the speech (col. 7, ln. 38-67 and col. 9, ln. 20-54).

6. Regarding claim 10, Mizuno et al. further disclose that the step of changing 
characteristics is selected from the group consisting essentially of: speech rate, pitch, 
volume, parametric equalization, formant frequencies and bandwidths, glottal sources, 
speech power spectrum tilt, gender, age, and identity (col. 7, ln. 38-67 and col. 9,
ln. 20-54; these are regarded as prosody information).

7. Regarding claims 11 and 25, Mizuno et al. further disclose the step of changing
relevant speaking style characteristics of the speech (col. 7, ln. 38-67 and col. 9,
ln. 20-54; angry or glad voices are different style characteristics).

8. Regarding claim 12, Mizuno et al. further disclose the step of changing 
characteristics is selected from the group consisting essentially of: dynamic prosody; 
and articulation (col. 7, ln. 38-67 and col. 9, ln. 20-54, changing the pitch of words can
yield articulating speech).

9. Regarding claims 13 and 26, Mizuno et al. further disclose the step of changing 
relevant emotion characteristics of the speech (col. 7, ln. 38-67 and col. 9, ln. 20-54).

10. Regarding claim 14, Mizuno et al. further disclose the step of changing an
urgency characteristic of the speech (col. 7, ln. 38-67 and col. 9, ln. 20-54; urgency is
considered prosody information).

11. Regarding claims 15 and 27, Mizuno et al. further disclose the step of changing
relevant dialect characteristics of the speech (col. 7, ln. 38-67 and col. 9, ln. 20-54;
changing the pitch information can produce a different accent and consequently yields
dialect characteristics).

12. Regarding claim 16, Mizuno et al. further disclose the step of changing 
characteristics selected from the group consisting essentially of: pronunciation; and 
articulation (col. 7, ln. 38-67 and col. 9, ln. 20-54, prosody information).

13. Regarding claims 17 and 28, Mizuno et al. further disclose the step of changing 
relevant content characteristics of the speech (col. 7, ln. 38-67 and col. 9, ln. 20-54).

14. Regarding claim 18, Mizuno et al. further disclose the step of changing 
characteristics selected from the group consisting essentially of: repetition; redundancy; 
and vocabulary (col. 7, ln. 38-67 and col. 9, ln. 20-54; that is, the intention characteristics
of the speech).

15. Regarding claim 20, Mizuno et al. further disclose the step of generating the
real-time data based on listener input (col. 8, ln. 26-39).

Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

Claims 19 and 21 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Mizuno et al. (US Patent No. 6226614) in view of Logan et al. (US Patent No. 
6199076). 

16. Regarding claim 19, Mizuno et al. fail to disclose the step of using polyphonic 
audio processing to spatially reposition the speech based on the real-time data. 
However, Logan et al. teach the step of using polyphonic audio processing to spatially 
reposition the speech based on the real-time data (col. 5, ln. 13-25). The advantage of
using the teaching of Logan et al. in Mizuno et al. is to play several sounds at the same 
time. 

Because Mizuno et al. and Logan et al. are analogous art from the same field of
endeavor, it would have been obvious to one of ordinary skill in the art at
the time the invention was made to modify Mizuno et al. by incorporating the teaching of
Logan et al. in order to play several sounds at the same time.

17. Regarding claim 21, Mizuno et al. fail to disclose the step of using the
synthesized speech in an automotive application. However, Logan et al. teach the step
of using the synthesized speech in an automotive application (col. 36, ln. 37-47). The
advantage of using the teaching of Logan et al. in Mizuno et al. is to provide an audible
response to the driver to minimize driving distraction.

Because Mizuno et al. and Logan et al. are analogous art from the same field of
endeavor, it would have been obvious to one of ordinary skill in the art at
the time the invention was made to modify Mizuno et al. by incorporating the teaching of
Logan et al. in order to provide an audible response to the driver to minimize driving
distraction.

Claims 2 and 29-30 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Mizuno et al. (US Patent No. 6226614) in view of Graciotti et al. (US Patent No. 
4903302).

18. Regarding claim 2, Mizuno et al. disclose the step of generating the real-time
data (col. 8, ln. 26-39), but fail to specifically disclose that generating the real-time data
is based on background noise contained in an environment in which the speech is
reproduced. However, Graciotti et al. teach that generating the real-time data is based
on background noise contained in an environment in which the speech is reproduced
(col. 4, ln. 16-26). The advantage of using the teaching of Graciotti et al. in Mizuno et
al. is to produce a suitable speech signal level to increase intelligibility for listeners.

Because Mizuno et al. and Graciotti et al. are analogous art from the same field of
endeavor, it would have been obvious to one of ordinary skill in the art at
the time the invention was made to modify Mizuno et al. by incorporating the teaching of
Graciotti et al. in order to produce a suitable speech signal level to increase intelligibility
for listeners.

19. Regarding claim 29, Mizuno et al. disclose a speech synthesizer adaptation
system comprising: 

a text-to-speech synthesizer for generating speech based on textual input and a 
plurality of run-time control parameter values (col. 8, ln. 16-52 or referring to figures 2
and 3) and an adaptation controller operatively coupled to the synthesizer and the audio
input system, the adaptation controller for modifying one or more of the run-time control
parameter values (col. 17, ln. 55-67 or referring to element 17 of figure 12).

Mizuno et al. fail to specifically disclose an audio input system for generating
real-time data based on background noise contained in an environment in which the 
speech is reproduced and the adaptation controller for modifying one or more of the 
run-time control parameter values based on the real-time data such that interference 
between the background noise and the speech is reduced. 

However, Graciotti et al. teach an audio input system for generating real-time
data based on background noise contained in an environment in which the speech is
reproduced (col. 4, ln. 16-26, teaching a method for controlling the volume based on the
background noise in the environment in which the speech is produced) and the adaptation
controller for modifying one or more of the run-time control parameter values based on
the real-time data such that interference between the background noise and the speech
is reduced (col. 4, ln. 16-26, by adjusting the volume of the signal). The advantage of
using the teaching of Graciotti et al. in Mizuno et al. is to produce a suitable speech
signal level to increase intelligibility for listeners.

Because Mizuno et al. and Graciotti et al. are analogous art from the same field of
endeavor, it would have been obvious to one of ordinary skill in the art at
the time the invention was made to modify Mizuno et al. by incorporating the teaching of
Graciotti et al. in order to produce a suitable speech signal level to increase intelligibility
for listeners.

20. Regarding claim 30, the modified Mizuno et al. fail to specifically disclose that
the audio input system includes an acoustic-to-electric signal converter. However, it
would have been obvious to one of ordinary skill in the art that any microphone has an
acoustic-to-electric signal converter in order to acquire a speech signal from users.

Claims 3-5, 29, and 30 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Mizuno et al. (US Patent No. 6226614) in view of Graciotti et al. (US 
Patent No. 4903302) and further in view of Goldberg et al. (US Patent No. 5970446). 

21. Regarding claim 3, the modified Mizuno et al. fail to disclose the steps of:
converting the background noise into an electrical signal; retrieving one or more 
interference models from a model database; and characterizing the background noise 
with the real-time data based on the electrical signal and the interference models. 

However, Goldberg et al. teach converting the background noise into an
electrical signal (col. 4, ln. 5-7, "recording" indicates the conversion of a sound signal
into an electrical signal); retrieving one or more interference models from a model database
(col. 3, ln. 54-64); and characterizing the background noise with the real-time data
based on the electrical signal and the interference models (col. 4, ln. 1-67). The
advantage of using the teaching of Goldberg et al. in the modified Mizuno et al. is to
reduce the noisy background signal to enhance recognition accuracy.

Because the modified Mizuno et al. and Goldberg et al. are analogous art from the
same field of endeavor, it would have been obvious to one of
ordinary skill in the art at the time the invention was made to further modify Mizuno et al.
by incorporating the teaching of Goldberg et al. in order to reduce the noisy background
signal to enhance recognition accuracy.

22. Regarding claims 4-5, the modified Mizuno et al. fail to specifically disclose the
step of performing a time domain and frequency domain analysis on the electrical
signal. However, Goldberg et al. further teach that different methods of analysis may be
used on the electrical signal (col. 2, ln. 44-47). It would have been obvious to one of
ordinary skill in the art that the noise signal can be analyzed by using time-domain
or frequency-domain analysis techniques in order to study characteristics of the
recorded noise.

Claim 6 is rejected under 35 U.S.C. 103(a) as being unpatentable over Mizuno et 
al. (US Patent No. 6226614) in view of Graciotti et al. (US Patent No. 4903302), further 
in view of Goldberg et al. (US Patent No. 5970446), and further in view of Lazar (US 
Patent No. 5818389). 

23. Regarding claim 6, the modified Mizuno et al. fail to specifically disclose the 
characterizing step is selected from the group consisting essentially of the steps of: 
identifying high level interference in the background noise; identifying low level 
interference in the background noise; identifying momentary interference in the 
background noise; identifying continuous interference in the background noise; 
identifying varying interference in the background noise; identifying stationary 
interference in the background noise; identifying spatial locations of sources of the 
background noise; identifying potential sources of the background noise; and identifying 
speech in the background noise. 

However, Lazar teaches identifying high level interference in the background
noise (col. 9, ln. 32-37); identifying low level interference in the background noise
(col. 10, ln. 35-39); identifying momentary interference in the background noise
(col. 3, ln. 29-38, varying interference can be considered as momentary interference); identifying
varying interference in the background noise (col. 3, ln. 29-38); identifying spatial
locations of sources of the background noise (col. 10, ln. 6-9); identifying potential
sources of the background noise (col. 5, ln. 30-33 or col. 6, ln. 46-52); and identifying
speech in the background noise (col. 9, ln. 1-15). The advantage of using the teaching
of Lazar in the modified Mizuno et al. is to identify types of interference so that the
system can take appropriate action to produce intelligible sound to listeners.

Because the modified Mizuno et al. and Lazar are analogous art from the same
field of endeavor, it would have been obvious to one of ordinary skill in
the art at the time the invention was made to further modify Mizuno et al. by
incorporating the teaching of Lazar in order to identify types of interference so that the
system can take appropriate action to produce intelligible sound to listeners.

The modified Mizuno et al. still fail to specifically disclose identifying continuous
and stationary interference in the background noise. However, it would have been
obvious to one of ordinary skill in the art that if the detected interference is not
varying, then the interference is continuous and stationary.



Conclusion 

The prior art made of record and not relied upon is considered pertinent to
applicant's disclosure. Silverman (US Patent No. 5751906), Acero (US Patent No.
6253182), and Gasper et al. (US Patent No. 5278943) teach a method for modifying
synthesized speech to achieve desired characteristics that are considered pertinent to
the claimed invention.

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Huyen Vo whose telephone number is 703-305-8665. 
The examiner can normally be reached on M-F, 9-5:30. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Doris To can be reached on 703-305-4827. The fax phone number for the 
organization where this application or proceeding is assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 